A Rule-Based Question Answering System For Reading Comprehension Tests

Authors

  • Ellen Riloff
  • Michael Thelen
Abstract

We have developed a rule-based system, Quarc, that can read a short story and find the sentence in the story that best answers a given question. Quarc uses heuristic rules that look for lexical and semantic clues in the question and the story. We have tested Quarc on reading comprehension tests typically given to children in grades 3-6. Overall, Quarc found the correct sentence 40% of the time, which is encouraging given the simplicity of its rules.

1 Introduction

In the United States, we evaluate the reading ability of children by giving them reading comprehension tests. These tests typically consist of a short story followed by questions. Presumably, the tests are designed so that the reader must understand important aspects of the story to answer the questions correctly. For this reason, we believe that reading comprehension tests can be a valuable tool to assess the state of the art in natural language understanding.

These tests are especially challenging because they can discuss virtually any topic. Consequently, broad-coverage natural language processing (NLP) techniques must be used. But the reading comprehension tests also require semantic understanding, which is difficult to achieve with broad-coverage techniques.

We have developed a system called Quarc that "takes" reading comprehension tests. Given a story and a question, Quarc finds the sentence in the story that best answers the question. Quarc does not use deep language understanding or sophisticated techniques, yet it achieved 40% accuracy in our experiments. Quarc uses hand-crafted heuristic rules that look for lexical and semantic clues in the question and the story.

In the next section, we describe the reading comprehension tests. In the following sections, we describe the rules used by Quarc and present experimental results.

2 Reading Comprehension Tests

Figure 1 shows an example of a reading comprehension test from Remedia Publications.
Each test is followed by five "WH" questions: WHO, WHAT, WHEN, WHERE, and WHY. The answers to the questions typically refer to a string in the text, such as a name or description, which can range in length from a single noun phrase to an entire clause or sentence. The answers to WHEN and WHERE questions are also sometimes inferred from the dateline of the story. For example, (EGYPT, 1951) contains the answer to the WHEN question in Figure 1.

Ideally, a natural language processing system would produce the exact answer to a question. Identifying the precise boundaries of the answer can be tricky, however. We will focus on the somewhat easier task of identifying the sentence that contains the answer to a question.

3 A Rule-based System for Question Answering

Quarc (QUestion Answering for Reading Comprehension) is a rule-based system that uses lexical and semantic heuristics to look for evidence that a sentence contains the answer to a question. Each type of WH question looks for different types of answers, so Quarc uses a separate set of rules for each question type (WHO, WHAT,
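
The sentence-selection idea described in this excerpt can be illustrated with a minimal Python sketch. It is not Quarc's actual rule set, which the excerpt does not give. The stopword list, the clue weight, the regular expressions, and all function names below are hypothetical choices made only for illustration: each sentence is scored by content-word overlap with the question, plus a bonus from a small set of rules chosen by question type.

```python
import re

# Hypothetical stopword list and clue weight; neither comes from the paper.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "at", "to", "is", "was",
             "did", "do", "does", "who", "what", "when", "where", "why"}
CLUE = 3  # assumed bonus for a sentence that shows a type-specific clue

# Assumed lexical clues for two question types (illustrative only).
TIME_PATTERN = re.compile(
    r"\b(\d{4}|today|yesterday|tomorrow|morning|night|year|month|week)\b",
    re.IGNORECASE)
REASON_PATTERN = re.compile(r"\b(because|so that|in order to)\b", re.IGNORECASE)


def content_words(text):
    """Lowercased alphanumeric tokens of `text`, minus stopwords."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower())
            if w not in STOPWORDS}


def word_match(question, sentence):
    """Number of content words shared by the question and the sentence."""
    return len(content_words(question) & content_words(sentence))


def when_rules(question, sentence):
    """Reward sentences that contain a time expression."""
    return CLUE if TIME_PATTERN.search(sentence) else 0


def why_rules(question, sentence):
    """Reward sentences that contain an explicit reason marker."""
    return CLUE if REASON_PATTERN.search(sentence) else 0


# One rule set per question type; types without rules use word overlap only.
RULES = {"when": when_rules, "why": why_rules}


def question_type(question):
    """Use the first word of the question as its type."""
    return question.strip().split()[0].lower()


def best_sentence(story, question):
    """Return the story sentence with the highest combined score."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", story)
                 if s.strip()]
    rules = RULES.get(question_type(question), lambda q, s: 0)
    return max(sentences,
               key=lambda s: word_match(question, s) + rules(question, s))
```

Calling `best_sentence(story, question)` with a short story string and a WH question returns one sentence from the story. Dispatching on the question type mirrors the excerpt's statement that each question type has its own rule set, but the specific rules here are placeholders.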

Similar resources

QArabPro: A Rule Based Question Answering System for Reading Comprehension Tests in Arabic

Problem statement: Extensive research efforts in the area of Natural Language Processing (NLP) were focused on developing reading comprehension Question Answering (QA) systems for Latin-based languages such as English, French and German. Approach: However, little effort was directed towards the development of such systems for bidirectional languages such as Arabic, Urdu and Farsi. In general, ...

Graph-based Word Clustering Applied to Question Answering and Reading Comprehension Tests

This paper describes our participation in the QA4MRE 2011 task, targeted at reading comprehension tests and multiple choice question answering. Our system constructs a co-occurrence graph with words that are common or proper nouns and verbs extracted from each document. The documents are pre-selected through an information retrieval process for recovering only those that are most relevant to a ...

The Effect of Iranian EFL Learners’ Self-generated vs. Group-generated Text-based Questions on their Reading Comprehension

Reading comprehension is one of the most important skills, especially in the EFL context. One way to improve reading comprehension is through strategy use. The present study aimed at investigating the effect of question-generation strategy on learners' reading comprehension. The participants in the study were 63 intermediate students from three intact groups in Resa institute in Boukan, They we...

Improving Question Answering for Reading Comprehension Tests by Combining Multiple Systems

Most work on reading comprehension question answering systems has focused on improving performance by adding complex natural language processing (NLP) components to such systems rather than by combining the output of multiple systems. Our paper empirically evaluates whether combining the outputs of seven such systems submitted as the final projects for a graduate level class can improve over th...

Semantic Answer Validation in Question Answering Systems for Reading Comprehension Tests

This paper presents a methodology for tackling the problem of answer validation in question answering for reading comprehension tests. The implemented system accepts a document as input and answers multiple choice questions about it based on semantic similarity measures. It uses the Lucene information retrieval engine for carrying out information extraction employing additional aut...

Publication date: 2000